Abstract.

Very few theoretical results have been obtained to date about the
behavior of information retrieval algorithms under random deletions,
as well as random insertions. The present paper offers a possible
explanation for this dearth of results, by showing that one of the
simplest such algorithms already requires a surprisingly intricate
analysis. Even when the data structure never contains more than
three items at a time, it is shown that the performance of the standard
tree search/insertion/deletion algorithm involves Bessel functions and
the solution of bivariate integral equations. A step-by-step expository
analysis of this problem is given, and it is shown how the difficulties
arise and can be surmounted.
1. Introduction.


An algorithm known as "tree search and insertion" has become one of
the most commonly used methods for maintaining a dynamically growing
dictionary or symbol table (see [3]). This algorithm was discovered
independently by several people during the 1950's, and in 1962 Thomas N.
Hibbard [1] showed that entries could also be deleted dynamically without
difficulty. At that time Hibbard proved one of the first results that
might be called a theorem of "pure computer science", because it was one
of the first results ever to be proved about data structure manipulations:
He showed that a random deletion from a random tree, using his algorithm,
leaves a random tree. Although the statement may seem self-evident when
stated in this way, it was in fact a surprising result, because the
deletion algorithm was necessarily asymmetric while random trees are
symmetric. Hibbard's theorem can be stated more precisely as follows:
"If n+1 items are inserted into an initially empty binary tree, in
random order, and if one of these (selected at random) is deleted, the
probability that the resulting binary tree has a given shape is the same
as the probability that this tree shape would be obtained by inserting
n items into an initially empty tree, in random order." It took great
foresight even to conjecture such a result in 1962; people rarely proved things
about computer programs in those days, unless perhaps numerical analysis was
involved, and binary trees were not well understood. Furthermore, the
proof was not simple.

Ten years later, Gary D. Knott proved a much deeper result [2]:
If n items are inserted into an initially empty binary tree, in random
order, and if the first k items inserted are subsequently deleted by
Hibbard's algorithm, in the same order as they were inserted, the resulting
binary tree is random. (In other words, the probability that the resulting
tree has a given shape is the same as the probability that this shape of
tree would be obtained if n-k items had been inserted into an initially
empty tree in random order.) The theorems of Hibbard and Knott seemed to
settle the question of deletions, since they proved stability of the tree
distribution under a wide variety of deletion disciplines.
However, Knott also discovered a surprising paradox: Although
Hibbard's theorem establishes that n+1 random insertions followed by
a random deletion produces a tree whose shape has the distribution of
n random insertions, it does not follow that a subsequent random
insertion yields a tree whose shape has the distribution of n+1 random
insertions! For ten years it had been believed that Hibbard's theorem
proved the stability of the algorithms under repeated insertions and
deletions (cf. [1], p. 25, and [3], first printing, pp. 429-432);
the discovery of a subtle fallacy in this reasoning therefore came as
a shock.

In order to understand the paradox, we need to know only what
Hibbard's algorithm does to binary search trees with three elements or
less. The five binary search trees on three elements x < y < z are

    A(x,y,z)    B(x,y,z)    C(x,y,z)    D(x,y,z)    E(x,y,z)

        z         z            y          x           x
       /         /            / \          \           \
      y         x            x   z          z           y
     /           \                         /             \
    x             y                       y               z

and the two possibilities on two elements x < y are

    F(x,y)    G(x,y)

      y        x
     /          \
    x            y

The standard insertion algorithm produces the following binary search tree
when inserting element z into a tree containing x and y:

    Initial tree    z < x < y    x < z < y    x < y < z
    F(x,y)          A(z,x,y)     B(x,z,y)     C(x,y,z)
    G(x,y)          C(z,x,y)     D(x,z,y)     E(x,y,z)

In other words, z is simply attached "at the bottom" where it fits.
Hibbard's deletion algorithm operates as follows on a 3-element tree:

    Initial tree    Delete x    Delete y    Delete z
    A(x,y,z)        F(y,z)      F(x,z)      F(x,y)
    B(x,y,z)        F(y,z)      F(x,z)      G(x,y)
    C(x,y,z)        G(y,z)      F(x,z)      F(x,y)
    D(x,y,z)        G(y,z)      G(x,z)      G(x,y)
    E(x,y,z)        G(y,z)      G(x,z)      G(x,y)

If we insert three elements x < y < z in random order, we get a tree
of shape A, B, C, D, E with the respective probabilities 1/6, 1/6,
2/6, 1/6, 1/6; then a random deletion leaves us with the following
six possibilities and probabilities:

    F(x,y)    F(x,z)    F(y,z)    G(x,y)    G(x,z)    G(y,z)
     3/18      4/18      2/18      3/18      2/18      4/18

The probability of shape F at this point is 9/18 = 1/2, in accord with
Hibbard's theorem.

But now comes another random insertion, say w. The probability
is 1/4 that w is the smallest of {w,x,y,z}; and the other three
cases x < w < y < z, x < y < w < z, x < y < z < w also occur with
probability 1/4. Thus the tree F(x,y) becomes A(w,x,y), B(x,w,y)
or C(x,y,w) with respective probabilities 1/4, 1/4, 1/2; and the
other cases F(x,z), ..., G(y,z) can be worked out similarly. We find
that the insertion of w produces a tree of shape A, B, C, D, E
with the respective probabilities

$$ \frac{3+4+4}{72},\quad \frac{3+8+2}{72},\quad \frac{6+4+2+3+2+8}{72},\quad \frac{3+4+4}{72},\quad \frac{6+2+4}{72}, \tag{1.1} $$

namely 11/72, 13/72, 25/72, 11/72, 12/72.

A random deletion now produces a tree of shape F with probability

$$ \frac{11}{72} + \frac{2}{3}\cdot\frac{13}{72} + \frac{2}{3}\cdot\frac{25}{72} = \frac{109}{216} > \frac{1}{2}. $$

A study  of  this  example  shows  where  the  fallacy  occurred:  The 

"random"  tree  shape  was  not  independent  of  the  "random"  values  remaining. 

For  example,  when  x is  deleted  (relatively  large  values  remaining),  the 
tree  tends  to  be  of  shape  G , but  when  z is  deleted  (relatively  small 
values  remaining)  the  tree  shape  is  not  biased  towards  F or  G . 

Fortunately  the  deviation  from  randomness  occurs  in  the  right  direction 
here:  the  trees  actually  tend  to  get  better,  in  the  sense  that  the 

balanced  shape  C (which  requires  less  search  time)  becomes  more  probable. 
Extensive  empirical  studies  by  Knott  [ 2 ] give  overwhelming  support  to 
the  conjecture  that  random  deletions  do  not  degrade  the  average  search 
time;  but  no  proof  has  yet  been  found. 

More precisely, Knott's conjecture is this: Consider a pattern of
n+k insertions and n deletions, in some order, where the number of
deletions never exceeds the number of insertions. For example, one of
the patterns with n = 4 and k = 4 is IIIDIIDIIIDD.
To do each insertion, put a new random element into the tree, say a
uniform random number between 0 and 1; to do each deletion, choose a
random element uniformly from among those present. All of these random
choices are to be independent. Then for each fixed pattern of I's and
D's, the average path length of the resulting tree is conjectured to be
at most equal to the average path length of the pattern consisting
solely of k I's.

In attempting to explore this conjecture, it is natural to investigate
the simple case of patterns

    III, IIIDI, IIIDIDI, ..., III(DI)^n, ...

for k = 3. Such patterns never require us to deal with more than three
elements in the tree at any time; so all we must do is study the following
trivial procedure.

1. Let x, y be independent uniform random numbers. Insert x into
   an empty tree, then insert y. (If x < y, we get the tree
   G(x,y), otherwise we get F(y,x).)

2. Insert a new independent uniform random number into the tree.

3. Choose one of the three elements in the tree at random, each with
   equal probability, and delete it using Hibbard's method.

4. Return to step 2.
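The four steps above can be tried out directly. The following Python sketch is an illustration only, not part of the original analysis: the tuple representation `(key, left, right)`, the helper names, the seed and the number of rounds are all assumptions made here. It runs the procedure and records how often the three-element tree seen at step 3 is the balanced shape C.

```python
import random

def insert(tree, key):
    """Standard BST insertion; a tree is None or a tuple (key, left, right)."""
    if tree is None:
        return (key, None, None)
    k, left, right = tree
    if key < k:
        return (k, insert(left, key), right)
    return (k, left, insert(right, key))

def hibbard_delete(tree, key):
    """Hibbard's deletion: promote the left subtree if the right one is empty,
    otherwise replace the node by its successor."""
    k, left, right = tree
    if key < k:
        return (k, hibbard_delete(left, key), right)
    if key > k:
        return (k, left, hibbard_delete(right, key))
    if right is None:
        return left
    succ = right
    while succ[1] is not None:
        succ = succ[1]
    return (succ[0], left, hibbard_delete(right, succ[0]))

def keys(tree):
    """Keys of the tree in increasing order."""
    if tree is None:
        return []
    return keys(tree[1]) + [tree[0]] + keys(tree[2])

def is_shape_c(tree):
    """A three-node tree has shape C exactly when the root has two children."""
    return tree[1] is not None and tree[2] is not None

def estimate_balanced_fraction(rounds, seed=1):
    rng = random.Random(seed)
    tree = insert(insert(None, rng.random()), rng.random())   # step 1
    hits = 0
    for _ in range(rounds):
        tree = insert(tree, rng.random())                      # step 2
        hits += is_shape_c(tree)
        tree = hibbard_delete(tree, rng.choice(keys(tree)))    # step 3
    return hits / rounds                                       # step 4 is the loop
```

Calling `estimate_balanced_fraction(100_000)` gives a frequency that should lie a little above 1/3 if the conjecture holds for this pattern; being a Monte Carlo estimate, it only suggests the behavior and proves nothing.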

At the beginning of the (n+1)-st occurrence of step 3, we have a
tree of shape A, B, C, D, or E, with certain probabilities
$a_n, b_n, c_n, d_n, e_n$; we want to show that these probabilities approach
a "steady state." According to the conjecture, $c_n$ should be $\ge 1/3$,
because only shape C has a path length smaller than the other shapes.
The first two times we get to step 3, we have seen that $(a_n, \ldots, e_n)$
are respectively (1/6, 1/6, 1/3, 1/6, 1/6) and (11/72, 13/72, 25/72, 11/72, 12/72).
What do these probabilities look like after n deletions have been made,
for large n? This is the problem we shall investigate in the remainder
of the paper.

It turns out that this problem is not as simple as it might appear
at first, in spite of the triviality of the algorithm; in fact, the
analysis ranks among the more difficult of all exact analyses of algorithms
that have been carried out to date, although it is "elementary" in the
sense that no deep theorems of analysis are required. From the form of
the answer we shall derive, it will be clear that the problem itself is
intrinsically difficult: no really simple derivation would be able to
produce such a complicated answer, and the answer is right! Since the
difficulties we will encounter are interesting and instructive, an attempt
has been made to present the solution here in a motivated way, explaining
how it was found, instead of simply to present a polished proof.
2. The Recurrences to be Solved.

The behavior of the trivial algorithm depends only on the relative
order of the elements inserted, and the particular choice made at each
deletion step. Therefore one way to analyze the situation after the
pattern III(DI)^n is to consider $(n+3)!\,3^n$ configurations to be equally
likely, reflecting the relative order of the n+3 elements inserted and
the n 3-way choices of which element to delete. For example, when
n = 1 there are 72 equally likely possibilities, and our analysis of
this case in (1.1) essentially considered them all.
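The discrete approach just described can be carried out by brute force for small n. The sketch below is an illustration under assumptions made here (tuple trees `(key, left, right)`, the names `shape_distribution` and `NAMES`); it enumerates all $(n+3)!\,3^n$ configurations of the pattern III(DI)^n and tallies the final tree shapes exactly.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations, product

def insert(tree, key):
    if tree is None:
        return (key, None, None)
    k, left, right = tree
    if key < k:
        return (k, insert(left, key), right)
    return (k, left, insert(right, key))

def delete(tree, key):
    """Hibbard's deletion on a tuple tree."""
    k, left, right = tree
    if key < k:
        return (k, delete(left, key), right)
    if key > k:
        return (k, left, delete(right, key))
    if right is None:
        return left
    succ = right
    while succ[1] is not None:
        succ = succ[1]
    return (succ[0], left, delete(right, succ[0]))

def shape(tree):
    return None if tree is None else (shape(tree[1]), shape(tree[2]))

LEAF = (None, None)
NAMES = {
    ((LEAF, None), None): "A",
    ((None, LEAF), None): "B",
    (LEAF, LEAF): "C",
    (None, (LEAF, None)): "D",
    (None, (None, LEAF)): "E",
}

def shape_distribution(n):
    """Exact shape probabilities after III(DI)^n, by full enumeration."""
    counts, total = Counter(), 0
    for ranks in permutations(range(n + 3)):         # relative order of insertions
        for picks in product(range(3), repeat=n):    # which of 3 keys to delete
            tree = None
            for r in ranks[:3]:
                tree = insert(tree, r)
            present = sorted(ranks[:3])
            for step, pick in enumerate(picks):
                victim = present[pick]
                tree = delete(tree, victim)
                present.remove(victim)
                tree = insert(tree, ranks[3 + step])
                present = sorted(present + [ranks[3 + step]])
            counts[NAMES[shape(tree)]] += 1
            total += 1
    return {name: Fraction(counts[name], total) for name in "ABCDE"}
```

For n = 1 this visits the 72 cases behind (1.1). The cost grows like $(n+3)!\,3^n$, which is one practical reason to look for a different method.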

However, such a discrete approach leads to great complications. The
following continuous approach which follows the algorithm more closely
turns out to be much simpler: Let $f_n(x,y)\,dx\,dy$ be the differential
probability that the tree is F(X,Y) at the beginning of step 2, after
n elements have been deleted, where

$$ x \le X < x + dx \quad\text{and}\quad y \le Y < y + dy; $$

and let $g_n(x,y)\,dx\,dy$ be the corresponding probability that it is G(X,Y).
Let $a_n(x,y,z)\,dx\,dy\,dz, \ldots, e_n(x,y,z)\,dx\,dy\,dz$ be the respective probabilities
that the tree is A(X,Y,Z), ..., E(X,Y,Z) at the beginning of step 3,
for some $x \le X < x+dx$, $y \le Y < y+dy$, $z \le Z < z+dz$. Then it is
possible to write down recurrence relations for these differential
probabilities by directly translating the algorithm into mathematical
formalism. First we have

$$\begin{aligned}
a_n(x,y,z) &= f_n(y,z), \\
b_n(x,y,z) &= f_n(x,z), \\
c_n(x,y,z) &= f_n(x,y) + g_n(y,z), \\
d_n(x,y,z) &= g_n(x,z), \\
e_n(x,y,z) &= g_n(x,y),
\end{aligned} \qquad \text{for } 0 \le x < y < z \le 1, \tag{2.1}$$

by considering the six possible actions of step 2. (These probabilities are,
of course, zero when x < 0, x > y, y > z or z > 1; at the boundaries
x = 0, x = y, y = z, and z = 1 there may be discontinuities, and it does
not matter how we define the functions there.) Secondly we have


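$$\begin{aligned}
f_{n+1}(x,y) ={}& \frac13\int_0^x \bigl(a_n(t,x,y) + b_n(t,x,y)\bigr)\,dt \\
&+ \frac13\int_x^y \bigl(a_n(x,t,y) + b_n(x,t,y) + c_n(x,t,y)\bigr)\,dt \\
&+ \frac13\int_y^1 \bigl(a_n(x,y,t) + c_n(x,y,t)\bigr)\,dt, \\
g_{n+1}(x,y) ={}& \frac13\int_0^x \bigl(c_n(t,x,y) + d_n(t,x,y) + e_n(t,x,y)\bigr)\,dt \\
&+ \frac13\int_x^y \bigl(d_n(x,t,y) + e_n(x,t,y)\bigr)\,dt \\
&+ \frac13\int_y^1 \bigl(b_n(x,y,t) + d_n(x,y,t) + e_n(x,y,t)\bigr)\,dt,
\end{aligned} \tag{2.2}$$

for $0 \le x < y \le 1$,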

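by considering the possible actions of step 3. Inserting (2.1) into (2.2)
and applying obvious simplifications yields the fundamental recurrences

$$\begin{aligned}
f_{n+1}(x,y) = \frac13\Bigl( & f_n(x,y) + \int_0^y f_n(t,y)\,dt + \int_x^y f_n(x,t)\,dt \\
& + \int_x^y g_n(t,y)\,dt + \int_y^1 f_n(y,t)\,dt + \int_y^1 g_n(y,t)\,dt \Bigr), \\
g_{n+1}(x,y) = \frac13\Bigl( & g_n(x,y) + \int_0^x f_n(t,x)\,dt + \int_0^x g_n(t,y)\,dt \\
& + \int_0^x g_n(t,x)\,dt + \int_x^1 g_n(x,t)\,dt + \int_y^1 f_n(x,t)\,dt \Bigr),
\end{aligned} \tag{2.3}$$

for $0 \le x < y \le 1$.

Consideration of step 1 also leads to the obvious initial conditions

$$ f_0(x,y) = g_0(x,y) = 1, \qquad \text{for } 0 \le x < y \le 1. \tag{2.4} $$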
We have now transformed the algorithm mechanically into a set of
equations that precisely describe the distribution of its behavior. The
quantities of interest to us are

$$ a_n = \int_0^1\!\!\int_0^z\!\!\int_0^y a_n(x,y,z)\,dx\,dy\,dz, \quad \ldots, \quad e_n = \int_0^1\!\!\int_0^z\!\!\int_0^y e_n(x,y,z)\,dx\,dy\,dz, \tag{2.5} $$

namely the respective probabilities that a tree of shape A, ..., E occurs
after the insertion/deletion pattern III(DI)^n; and

$$ f_n = \int_0^1\!\!\int_0^y f_n(x,y)\,dx\,dy, \qquad g_n = \int_0^1\!\!\int_0^y g_n(x,y)\,dx\,dy, \tag{2.6} $$

the probabilities that the tree shape is F or G after the pattern II(ID)^n.
Hibbard's theorem for trees of size 2 states that $f_0 = f_1$ and $g_0 = g_1$.
3. Simplification of the Recurrences.

What can we do with such formidable recurrences (2.3)-(2.4)?
In the first place we can look for invariant relations that might be
used to simplify them.

When the algorithm reaches step 2, it is clear that the two numbers
X and Y in its tree are random, except for the condition that X < Y.
Thus we must have

$$ f_n(x,y) + g_n(x,y) = 2, \qquad \text{for } 0 \le x < y \le 1 \text{ and } n \ge 0. \tag{3.1} $$

(It is 2, not 1, since the probability that $x \le X < x+dx$ and
$y \le Y < y+dy$ given that X < Y is $2\,dx\,dy$.) This formula could also be
proved directly from (2.3) and (2.4), by induction on n.

Relation (3.1) means that we really have only one function to worry
about, namely $f_n(x,y)$. Let us rewrite (2.3) and (2.4) to take account
of this fact:

$$\begin{aligned}
f_0(x,y) &= 1, \\
f_{n+1}(x,y) &= \frac13\Bigl( 2 - 2x + f_n(x,y) + \int_0^x f_n(t,y)\,dt + \int_x^y f_n(x,t)\,dt \Bigr), \qquad \text{for } n \ge 0.
\end{aligned} \tag{3.2}$$

Henceforth we shall avoid mentioning the condition $0 \le x < y \le 1$,
for if we use (3.2) to define $f_n(x,y)$ for all x and y it will
agree with the true $f_n(x,y)$ when $0 \le x < y \le 1$.

We have obtained a much simpler recurrence than (2.3)-(2.4), but (3.2)
still has some undesirable features. Before proceeding any further, we
can use (3.2) to check what we have done so far, by computing the first
few $f$'s:

$$\begin{aligned}
f_1(x,y) &= 1 - \tfrac23 x + \tfrac13 y, & f_1 &= \tfrac12; \\
f_2(x,y) &= 1 - \tfrac89 x + \tfrac49 y + \tfrac1{18}(x-y)^2, & f_2 &= \tfrac{109}{216}.
\end{aligned}$$

Good.
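The check just made by hand can also be mechanized. The sketch below is illustrative, with assumed conventions: a polynomial in x and y is stored as a dictionary mapping `(i, j)` to the coefficient of $x^i y^j$, and the names `step` and `triangle_integral` are introduced here. It applies recurrence (3.2) with exact rational arithmetic and integrates the result over $0 < x < y < 1$.

```python
from fractions import Fraction

def add_term(poly, i, j, c):
    poly[(i, j)] = poly.get((i, j), Fraction(0)) + c

def step(f):
    """One application of recurrence (3.2) to a polynomial f."""
    g = {}
    add_term(g, 0, 0, Fraction(2))         # the constant 2
    add_term(g, 1, 0, Fraction(-2))        # the term -2x
    for (i, j), c in f.items():
        add_term(g, i, j, c)                      # f_n(x, y)
        add_term(g, i + 1, j, c / (i + 1))        # integral of c t^i y^j for t in [0, x]
        add_term(g, i, j + 1, c / (j + 1))        # integral of c x^i t^j for t in [x, y]:
        add_term(g, i + j + 1, 0, -c / (j + 1))   #   c x^i (y^(j+1) - x^(j+1)) / (j+1)
    return {key: value / 3 for key, value in g.items() if value != 0}

def triangle_integral(f):
    """Integral over 0 < x < y < 1; x^i y^j integrates to 1/((i+1)(i+j+2))."""
    return sum(c / ((i + 1) * (i + j + 2)) for (i, j), c in f.items())

f0 = {(0, 0): Fraction(1)}
f1 = step(f0)
f2 = step(f1)
```

Here `triangle_integral(f2)` evaluates to 109/216, the probability of shape F found in the introduction. Further calls to `step` give $f_3, f_4, \ldots$ without any case analysis.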
We are hoping that the process converges for large n, and in this
case the limiting distribution $f_\infty(x,y)$ will have to satisfy the integral
equation

$$ f_\infty(x,y) = \frac13\Bigl( 2 - 2x + f_\infty(x,y) + \int_0^x f_\infty(t,y)\,dt + \int_x^y f_\infty(x,t)\,dt \Bigr). \tag{3.3} $$

Before going on to find a solution to this equation, let us verify that
$f_n(x,y)$ will indeed converge to $f_\infty(x,y)$ if $f_\infty(x,y)$ exists: Subtracting
(3.3) from (3.2) yields

$$ r_{n+1}(x,y) = \frac13\Bigl( r_n(x,y) + \int_0^x r_n(t,y)\,dt + \int_x^y r_n(x,t)\,dt \Bigr), $$

where $r_n(x,y) = f_n(x,y) - f_\infty(x,y)$. Now if $|r_n(x,y)| \le \alpha$ for $0 \le x < y \le 1$,
we will have

$$ |r_{n+1}(x,y)| \le \frac13\Bigl( \alpha + \int_0^x \alpha\,dt + \int_x^y \alpha\,dt \Bigr) = \frac{1+y}{3}\,\alpha \le \frac23\,\alpha. $$

Therefore if $f_\infty(x,y)$ exists, so that $r_0(x,y)$ is bounded, the remainder
$r_n(x,y) = O((2/3)^n)$ converges rapidly to zero, regardless of the initial
distribution $f_0(x,y)$.

It remains to determine $f_\infty(x,y)$, whose defining equation (3.3) can be
rewritten

$$ f_\infty(x,y) = 1 - x + \frac12\Bigl( \int_0^x f_\infty(t,y)\,dt + \int_x^y f_\infty(x,t)\,dt \Bigr). \tag{3.4} $$

The coefficient 1/2 can be removed from this relation by letting

$$ q(x,y) = f_\infty(2x,2y), $$

so that

$$ q(x,y) = 1 - 2x + \int_0^x q(t,y)\,dt + \int_x^y q(x,t)\,dt. \tag{3.5} $$

What is this function q(x,y)? (It is suggested that the reader might
enjoy trying to find it before reading on.)


4. Solving the Integral Equation.

In attempting to solve (3.5), perhaps the first thing we might try
is differentiation. Let $q'(x,y) = \partial q(x,y)/\partial x$, and $q_{\prime}(x,y) = \partial q(x,y)/\partial y$;
then

$$ q'(x,y) = -2 + q(x,y) + \int_x^y q'(x,t)\,dt - q(x,x), \tag{4.1} $$

$$ q_{\prime}(x,y) = \int_0^x q_{\prime}(t,y)\,dt + q(x,y), \tag{4.2} $$

$$ q'_{\prime}(x,y) = q_{\prime}(x,y) + q'(x,y). \tag{4.3} $$

If we postulate that q has a power series expansion

$$ q(x,y) = \sum_{m,n\ge0} q_{m,n}\,\frac{x^m}{m!}\,\frac{y^n}{n!}, \tag{4.4} $$

we find

$$ q'(x,y) = \sum_{m,n\ge0} q_{m+1,n}\,\frac{x^m}{m!}\,\frac{y^n}{n!}, \qquad
q_{\prime}(x,y) = \sum_{m,n\ge0} q_{m,n+1}\,\frac{x^m}{m!}\,\frac{y^n}{n!}, \qquad
q'_{\prime}(x,y) = \sum_{m,n\ge0} q_{m+1,n+1}\,\frac{x^m}{m!}\,\frac{y^n}{n!}. \tag{4.5} $$

Therefore (4.3) yields the simple relation

$$ q_{m+1,n+1} = q_{m,n+1} + q_{m+1,n}, \qquad \text{for } m,n \ge 0, \tag{4.6} $$

from which it is possible to determine all the $q_{m,n}$ in terms of the
boundary values $q_{m,0}$ and $q_{0,n}$.

Setting x = 0 in (3.5) yields

$$ q(0,y) = 1 + \int_0^y q(0,t)\,dt, \tag{4.7} $$

hence $q(0,y) = e^y$ and

$$ q_{0,n} = 1, \qquad \text{for } n \ge 0. \tag{4.8} $$


Now comes a tricky manipulation, which was found while playing
around trying to determine q(x,0). If we apply (4.1) with x and y
interchanged, and add the two results, we get

$$\begin{aligned}
q'(x,y) + q'(y,x) &= -4 + q(x,y) + q(y,x) - q(x,x) - q(y,y) + \int_x^y \bigl(q'(x,t) - q'(y,t)\bigr)\,dt \\
&= -4 + \int_x^y \bigl(q'(t,x) - q'(t,y)\bigr)\,dt + \int_x^y \bigl(q'(x,t) - q'(y,t)\bigr)\,dt.
\end{aligned}$$

Let s(x,y) be the symmetric function $q'(x,y) + q'(y,x)$; we have just
proved that

$$ s(x,y) = -4 + \int_x^y \bigl(s(x,t) - s(y,t)\bigr)\,dt. \tag{4.9} $$

But this equation implies that $s(x,y) = -4$! Let

$$ s(x,y) = \sum_{m,n\ge0} s_{m,n}\,\frac{x^m}{m!}\,\frac{y^n}{n!}, \qquad s_{m,n} = q_{m+1,n} + q_{n+1,m}. \tag{4.10} $$

The coefficients $s_{m,n}$ for m+n = k > 0 on the left-hand side of (4.9)
all arise as homogeneous linear combinations of the coefficients $s_{m,n}$
for m+n = k-1, since

$$ \int_x^y \bigl(x^m t^n - y^m t^n\bigr)\,dt = \bigl(x^m y^{n+1} + x^{n+1} y^m - x^{m+n+1} y^0 - x^0 y^{m+n+1}\bigr)/(n+1); $$

hence we can prove by induction on k that $s_{m,n} = 0$ whenever m+n = k > 0.
It follows that

$$ q_{m+1,n} = -q_{n+1,m}, \qquad \text{for } m,n \ge 0 \text{ and } m+n > 0. \tag{4.11} $$

When m = n = 0 we have $-4 = s_{0,0} = q_{1,0} + q_{1,0}$, hence $q_{1,0} = -2$;
relations (4.6) and (4.8) imply that $q_{1,n} = n-2$ for all $n \ge 0$, and
(4.11) with n = 0 yields

$$ q_{m,0} = -q_{1,m-1} = 3 - m, \qquad \text{for } m \ge 2. \tag{4.12} $$


We have found the desired boundary conditions, and it remains to deduce
the general formula using (4.6). The binomial coefficient

$$ \binom{m+n+a}{m+b} $$

satisfies (4.6) for all integers a and b, so it suffices to find a
linear combination of these binomial coefficients, subject to the
condition that the known values of $q_{m,n}$ are obtained whenever m = 0
or n = 0. The solution in this form is not unique, because of
identities between binomial coefficients; probably the most elegant
way to express it is

$$ q_{m,n} = \binom{m+n-3}{m} - \binom{m+n-3}{m-4}. \tag{4.13} $$

Our derivation has proved that $q_{m,n}$ must have this value if the power
series q(x,y) postulated in (4.4) satisfies (3.5). Conversely, it is clear
that a power series solution to (3.5) exists, since the set of values
$q_{m,n}$ with m+n = k defines the set of values with m+n = k+1 after
integration. Therefore

$$ q(x,y) = \sum_{m,n\ge0} \left( \binom{m+n-3}{m} - \binom{m+n-3}{m-4} \right) \frac{x^m}{m!}\,\frac{y^n}{n!} \tag{4.14} $$

solves (3.5). Note that $|q_{m,n}| \le 2^{m+n}$, hence the power series is
absolutely convergent for all x, y, and (4.14) is the only power series
solution.
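Formula (4.13) relies on binomial coefficients whose upper index may be negative, so a mechanical check is reassuring. The sketch below is illustrative only; the helper names `binom`, `q` and `verify` are assumptions made here. It confirms the boundary values (4.8) and (4.12), the recurrence (4.6), the antisymmetry (4.11) and the bound $|q_{m,n}| \le 2^{m+n}$ on a finite grid.

```python
def binom(r, k):
    """Binomial coefficient r(r-1)...(r-k+1)/k! for any integer r; zero when k < 0."""
    if k < 0:
        return 0
    result = 1
    for i in range(k):
        # Each intermediate value is itself a binomial coefficient, so the
        # integer division is exact even when r is negative.
        result = result * (r - i) // (i + 1)
    return result

def q(m, n):
    """Coefficient q_{m,n} according to (4.13)."""
    return binom(m + n - 3, m) - binom(m + n - 3, m - 4)

def verify(limit=12):
    """Check (4.6), (4.8), (4.11), (4.12) and the growth bound for indices below limit."""
    for n in range(limit):
        assert q(0, n) == 1                      # (4.8)
        assert q(1, n) == n - 2
    for m in range(2, limit):
        assert q(m, 0) == 3 - m                  # (4.12)
    for m in range(limit):
        for n in range(limit):
            assert q(m + 1, n + 1) == q(m, n + 1) + q(m + 1, n)   # (4.6)
            if m + n > 0:
                assert q(m + 1, n) == -q(n + 1, m)                # (4.11)
            assert abs(q(m, n)) <= 2 ** (m + n)
    return True
```

A finite check of this kind is no proof, but it catches sign and index slips, which are easy to make when the upper index of a binomial coefficient goes negative.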

Finally let us try to express q(x,y) in terms of simpler functions,
possibly even "known" ones. The following somewhat surprising identity
is especially useful for functions of this type:

$$\begin{aligned}
e^{-x-y} \sum_{m,n\ge0} \binom{m+n+a}{m+b} \frac{x^m}{m!}\,\frac{y^n}{n!}
&= \sum_{j,k,m,n\ge0} (-1)^{j+k} \binom{m+n+a}{m+b} \frac{x^{m+j}\,y^{n+k}}{j!\,k!\,m!\,n!} \\
&= \sum_{M,N\ge0} \frac{x^M}{M!}\,\frac{y^N}{N!} \sum_{j,k\ge0} (-1)^{j+k} \binom{M}{j}\binom{N}{k}\binom{M+N-j-k+a}{M-j+b} \\
&= \sum_{M,N\ge0} \frac{x^M}{M!}\,\frac{y^N}{N!} \sum_{j,k\ge0} (-1)^{M+b+k} \binom{M}{j}\binom{N}{k}\binom{-N+k-a+b-1}{M-j+b} \\
&= \sum_{M,N\ge0} \frac{x^M}{M!}\,\frac{y^N}{N!} \sum_{k\ge0} (-1)^{M+b+k} \binom{N}{k}\binom{M-N+k-a+b-1}{M+b} \\
&= \sum_{M,N\ge0} \frac{x^M}{M!}\,\frac{y^N}{N!}\,(-1)^{M+N+b} \binom{M-N-a+b-1}{M-N+b} \\
&= \sum_{M,N\ge0} \frac{x^M}{M!}\,\frac{y^N}{N!} \binom{a}{M-N+b}.
\end{aligned} \tag{4.15}$$

When M-N has a fixed value, the terms of this sum are readily expressed
in terms of modified Bessel functions of the first kind, defined as usual
by the formula

$$ I_r(x) = \sum_{k \ge \max(0,-r)} \frac{(x/2)^{2k+r}}{k!\,(k+r)!}. \tag{4.16} $$

For example, if $a \ge 0$ all terms vanish except those for $0 \le M-N+b \le a$,
hence (4.15) reduces to a finite sum

$$ \sum_r \binom{a}{r} \sum_{\substack{M,N\ge0 \\ M+b = N+r}} \frac{x^M}{M!}\,\frac{y^N}{N!} = \sum_r \binom{a}{r} \left(\frac{x}{y}\right)^{(r-b)/2} I_{r-b}\bigl(2\sqrt{xy}\bigr). $$

On the other hand, if a < 0 (as it unfortunately is in our case), another
function is apparently required.
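Identity (4.15) is easy to test numerically before relying on it. The following sketch is an illustration with assumed names (`binom`, `lhs`, `rhs`) and an assumed truncation of both double series at 40 terms per index; it compares the two sides in floating point at a sample point, including cases with a < 0.

```python
import math

def binom(r, k):
    """Binomial coefficient for any integer r; zero when k < 0."""
    if k < 0:
        return 0
    result = 1
    for i in range(k):
        result = result * (r - i) // (i + 1)
    return result

def lhs(x, y, a, b, terms=40):
    """exp(-x-y) times the truncated double series on the left of (4.15)."""
    total = 0.0
    for m in range(terms):
        for n in range(terms):
            weight = x ** m * y ** n / (math.factorial(m) * math.factorial(n))
            total += binom(m + n + a, m + b) * weight
    return math.exp(-x - y) * total

def rhs(x, y, a, b, terms=40):
    """Truncated double series on the right of (4.15)."""
    total = 0.0
    for M in range(terms):
        for N in range(terms):
            weight = x ** M * y ** N / (math.factorial(M) * math.factorial(N))
            total += binom(a, M - N + b) * weight
    return total
```

For instance `lhs(0.3, 0.7, -3, 0)` and `rhs(0.3, 0.7, -3, 0)` should agree to within rounding error. The pairs (a, b) = (-3, 0) and (-3, -4) are the two that arise when (4.15) is applied to (4.14).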


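Let h(x,y) be the double power series

$$ h(x,y) = \sum_{m\ge n\ge0} \frac{x^m}{m!}\,\frac{y^n}{n!}, \tag{4.17} $$

which converges absolutely for all x and y. We have

$$ h(x,y) = \sum_{m,n\ge0} \frac{x^{m+n}\,y^n}{(m+n)!\,n!} = \sum_{m\ge0} \left(\frac{x}{y}\right)^{m/2} I_m\bigl(2\sqrt{xy}\bigr). \tag{4.18} $$

Furthermore

$$\begin{aligned}
h(x,y) &= e^y \sum_{m\ge0} \frac{x^m}{m!} \left( 1 - \frac{1}{m!} \int_0^y e^{-t}\,t^m\,dt \right) \\
&= e^{x+y} - e^y \int_0^y e^{-t} \left( \sum_{m\ge0} \frac{x^m\,t^m}{m!\,m!} \right) dt \\
&= e^{x+y} - e^y \int_0^y e^{-t}\,I_0\bigl(2\sqrt{xt}\bigr)\,dt,
\end{aligned} \tag{4.19}$$

so h(x,y) can be expressed in at least two ways in terms of Bessel
functions; but it does not seem to have any simpler expressions in "closed
form". The definition of h(x,y) is already sufficiently simple that we
can consider it a known function; we will express q(x,y) in terms of
h(x,y) and Bessel functions.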

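By (4.14) and (4.15),

$$\begin{aligned}
e^{-x-y}\,q(x,y) &= \sum_{m,n\ge0} \frac{x^m}{m!}\,\frac{y^n}{n!} \left( \binom{-3}{m-n} - \binom{-3}{m-n-4} \right) \\
&= \sum_{m\ge n\ge0} (-1)^{m+n}\,\frac{x^m}{m!}\,\frac{y^n}{n!}\,\bigl(4m - 4n - 2 + 3\delta_{m,n} + \delta_{m,n+1}\bigr) \\
&= 4xy\,i_1(xy) - 4x\,h(-x,-y) + 4y\,h(-x,-y) - 4y\,i_0(xy) \\
&\qquad - 2h(-x,-y) + 3i_0(xy) - x\,i_1(xy),
\end{aligned}$$

where $i_r(z) = \sum_{k\ge0} z^k / (k!\,(k+r)!)$. This yields the steady-state
distribution $f_\infty(x,y) = q(x/2,\,y/2)$.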

5. An Explicit Formula for $f_n(x,y)$.

Now that the limiting behavior has been found, we can look back
at the original recurrence (3.2) and see that it does not appear so
formidable any more. Let us define a sequence of polynomials as follows:

$$ p_0(x,y) = 1, \tag{5.1} $$

$$ p_1(x,y) = y - 2x, \tag{5.2} $$

$$ p_{k+1}(x,y) = \int_0^x p_k(t,y)\,dt + \int_x^y p_k(x,t)\,dt, \qquad \text{for } k \ge 1. \tag{5.3} $$

Thus $p_2(x,y) = \tfrac12 (x-y)^2$, $p_3(x,y) = \tfrac16 y^3$, etc.; it is easy to see
that each term of $p_k(x,y)$ has total degree k.

These polynomials handle the complicated parts of recurrence (3.2).
If we assume that $f_n(x,y)$ is a linear combination of the p's, say

$$ f_n(x,y) = \sum_{k\ge0} \varphi_{n,k}\,p_k(x,y) \tag{5.4} $$

with $\varphi_{n,0} = 1$, relations (3.2) and (5.3) imply that $f_{n+1}(x,y)$ also has
such a representation, namely

$$ f_{n+1}(x,y) = \frac13\Bigl( 2 - 2x + f_n(x,y) + y + \sum_{k\ge1} \varphi_{n,k}\,p_{k+1}(x,y) \Bigr). $$

Hence (5.4) holds for all n if the coefficients $\varphi_{n,k}$ satisfy

$$ \varphi_{n+1,0} = 1, \qquad \varphi_{n+1,k+1} = \tfrac13\bigl(\varphi_{n,k+1} + \varphi_{n,k}\bigr), \qquad \text{for } n \ge 0 \text{ and } k \ge 0. \tag{5.5} $$

Since $\varphi_{0,k} = 0$ for $k \ge 1$, this recurrence is easy to solve, and we have

$$ \varphi_{n,k} = \sum_{1\le j\le n} \binom{j-1}{k-1}\,3^{-j}, \qquad \text{for } n \ge 0 \text{ and } k \ge 1. \tag{5.6} $$
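The solution of the coefficient recurrence can be confirmed by iterating (5.5) and comparing with the closed form. The sketch below is illustrative; the names `phi_closed` and `phi_table` are assumptions, and the closed form used is the sum $\sum_{1\le j\le n}\binom{j-1}{k-1}3^{-j}$ taken as the reading of (5.6).

```python
from fractions import Fraction
from math import comb

def phi_closed(n, k):
    """Closed form for phi_{n,k}: 1 when k = 0, otherwise a finite sum over j."""
    if k == 0:
        return Fraction(1)
    return sum(Fraction(comb(j - 1, k - 1), 3 ** j) for j in range(1, n + 1))

def phi_table(n_max, k_max):
    """Rows phi_{n,0..k_max} for n = 0..n_max, obtained by iterating (5.5)."""
    rows = [[Fraction(1)] + [Fraction(0)] * k_max]      # phi_{0,k}
    for _ in range(n_max):
        prev = rows[-1]
        rows.append([Fraction(1)] +
                    [(prev[k + 1] + prev[k]) / 3 for k in range(k_max)])
    return rows
```

Each entry of `phi_table(10, 6)` equals the corresponding `phi_closed(n, k)`. As n grows the closed form tends to $2^{-k}$, which is the coefficient of $p_k$ in the limiting distribution.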
6. Approach to the Answers.

We have shown that the trivial algorithm leads to a (nontrivial)
limiting distribution. What we really want to know is the limiting
probabilities of the various tree shapes that arise, namely the
quantities $a_n, b_n, c_n, d_n, e_n, f_n, g_n$ defined by the integrals in (2.5)
and (2.6), as $n \to \infty$.

We clearly have

$$ a_n + b_n + c_n + d_n + e_n = 1, \tag{6.1} $$

$$ f_n + g_n = 1. \tag{6.2} $$

Furthermore since $b_n(x,y,z) + d_n(x,y,z) = 2$ by (2.1) and (3.1), we have

$$ b_n + d_n = \tfrac13. \tag{6.3} $$

Another relation, slightly more subtle, also holds. We have

$$\begin{aligned}
a_n &= \iiint_{0<x<y<z<1} f_n(y,z)\,dx\,dy\,dz = \iint_{0<x<y<1} x\,f_n(x,y)\,dx\,dy, \\
b_n &= \iiint_{0<x<y<z<1} f_n(x,z)\,dx\,dy\,dz = \iint_{0<x<y<1} (y-x)\,f_n(x,y)\,dx\,dy, \\
\tfrac13 - e_n &= \iiint_{0<x<y<z<1} f_n(x,y)\,dx\,dy\,dz = \iint_{0<x<y<1} (1-y)\,f_n(x,y)\,dx\,dy.
\end{aligned}$$

Therefore

$$ a_n + b_n + \tfrac13 - e_n = f_n. \tag{6.4} $$


And still another relation, even more subtle, can be obtained by
looking more closely. If we integrate both sides of (3.2) over
0 < x < y < 1 we find

$$\begin{aligned}
3 f_{n+1} &= \tfrac23 + f_n + \iint_{0<x<y<1} \int_0^x f_n(t,y)\,dt\,dx\,dy + \iint_{0<x<y<1} \int_x^y f_n(x,t)\,dt\,dx\,dy \\
&= \tfrac23 + f_n + b_n + \tfrac13 - e_n.
\end{aligned}$$

Combining this with (6.4) yields the somewhat surprising formula

$$ a_n + 3 f_{n+1} = \tfrac23 + 2 f_n. \tag{6.5} $$

For example, we know that $a_1 = \tfrac{11}{72}$, $f_1 = \tfrac12$, and $f_2 = \tfrac{109}{216}$; everything
checks out beautifully.

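From relations (6.1)-(6.5) we can determine all of $a_n, \ldots, e_n, f_n, g_n$
knowing only the values of $b_n$ and $f_n$ for all n. Let us first look
at $f_n$, and especially at the component involving $p_k(x,y)$; since $p_k(x,y)$
consists of the terms of total degree k in (4.14), we have

$$\begin{aligned}
\iint_{0<x<y<1} p_k(x,y)\,dx\,dy
&= \sum_j \frac{1}{j+1}\,\frac{1}{k+2}\,\frac{1}{j!\,(k-j)!} \left( \binom{k-3}{j} - \binom{k-3}{j-4} \right) \\
&= \frac{1}{(k+2)!} \sum_j \binom{k+1}{j+1} \left( \binom{k-3}{j} - \binom{k-3}{j-4} \right) \\
&= \frac{1}{(k+2)!} \left( \binom{2k-2}{k} - \binom{2k-2}{k-4} \right).
\end{aligned} \tag{6.6}$$

Similarly

$$\begin{aligned}
\iint_{0<x<y<1} (y-x)\,p_k(x,y)\,dx\,dy
&= \frac{1}{(k+3)!} \sum_j \binom{k+2}{j+2} \left( \binom{k-3}{j} - \binom{k-3}{j-4} \right) \\
&= \frac{1}{(k+3)!} \left( \binom{2k-1}{k} - \binom{2k-1}{k-4} \right).
\end{aligned} \tag{6.7}$$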


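These quantities are nonnegative for all $k \ge 0$, and since the
coefficients $\varphi_{n,k}$ in (5.4) and (5.6) are monotone nondecreasing
with n, it follows that

$$ f_{n+1} \ge f_n \quad\text{and}\quad b_{n+1} \ge b_n, \qquad \text{for } n \ge 0. \tag{6.8} $$

(A similar argument shows that $e_{n+1} \le e_n$ for all n.)

Let us now look at the limiting behavior. We have

$$ f_\infty(x,y) = \sum_{k\ge0} \frac{1}{2^k}\,p_k(x,y) $$

by (5.6), hence by (6.6) and (6.7) the probabilities $f_n$ and $b_n$
increase to the limits

$$ f_\infty = \sum_{k\ge0} \frac{1}{2^k\,(k+2)!} \left( \binom{2k-2}{k} - \binom{2k-2}{k-4} \right), \qquad
b_\infty = \sum_{k\ge0} \frac{1}{2^k\,(k+3)!} \left( \binom{2k-1}{k} - \binom{2k-1}{k-4} \right). \tag{6.9} $$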


7. Evaluation of the Final Sums.

The formulas in (6.9) converge rapidly, so we could compute them
and be done; but of course we would like to express the result in terms
of "known" mathematical quantities, for if there is a simple answer
we want to know about it. In order to get a cleaner sum to work with,
let us consider the similar series

$$ s_r(x) = \sum_{k\ge0} \binom{2k}{k} \frac{(x/2)^k}{(k+r)!}, \tag{7.1} $$

which converges absolutely for all x. Differentiation yields

$$\begin{aligned}
s_r'(x) &= \sum_{k\ge1} \binom{2k}{k} \frac{k}{2}\,\frac{(x/2)^{k-1}}{(k+r)!} \\
&= \sum_{k\ge0} \binom{2k}{k} \bigl( 2(k+r+1) - (2r+1) \bigr) \frac{(x/2)^k}{(k+r+1)!} \\
&= 2 s_r(x) - (2r+1)\,s_{r+1}(x).
\end{aligned} \tag{7.2}$$

Thus if we define

$$ t_r(x) = e^{-2x}\,s_r(x), \tag{7.3} $$

we have

$$ t_r'(x) = -(2r+1)\,t_{r+1}(x). \tag{7.4} $$

According to this relation, we obtain all $t_r(x)$ by starting with $t_0(x)$
and differentiating.


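A curious thing happens when we look at $t_0(x)$:

$$\begin{aligned}
e^{-2x}\,s_0(x) &= \sum_{k\ge0} \binom{2k}{k} \frac{(x/2)^k}{k!}\,e^{-2x} \\
&= \sum_{m\ge0} \frac{(-2x)^m}{m!} \sum_k \binom{m}{k} \binom{-1/2}{k} \\
&= \sum_{m\ge0} \frac{(-2x)^m}{m!} \binom{m-1/2}{m} \\
&= s_0(-x),
\end{aligned}$$

using the familiar identities

$$ (-1)^m \binom{-1/2}{m} = \binom{m-1/2}{m} = 2^{-2m} \binom{2m}{m}. \tag{7.5} $$

In other words, $e^{-x} s_0(x) = e^{x} s_0(-x)$ is an even
function! This coincidence deserves looking into; let us write

$$ e^{-x}\,s_0(x) = \sum_{k\ge0} \binom{2k}{k} \frac{(x/2)^k}{k!}\,e^{-x} = \sum_{m\ge0} \frac{(-x)^m}{m!}\,u_m, $$

where

$$ u_m = \sum_k \binom{m}{k} \frac{(-1)^k}{2^k} \binom{2k}{k}. \tag{7.6} $$

After a few moments of playing with this sum, an experienced binomial-
coefficientologist might hit on the following elementary method of evaluation:

$$\begin{aligned}
u_m &= u_{m-1} + \sum_k \binom{m-1}{k-1} \frac{(-1)^k}{2^k} \binom{2k}{k} \\
&= u_{m-1} - \sum_k \binom{m-1}{k} \frac{(-1)^k}{2^{k+1}} \binom{2k+2}{k+1} \\
&= u_{m-1} - \sum_k \binom{m}{k+1} \frac{(-1)^k}{2^k} \binom{2k}{k} \frac{2k+1}{m} \\
&= u_{m-1} - \sum_k \binom{m}{k+1} \frac{(-1)^k}{2^k} \binom{2k}{k} \left( \frac{2k+2}{m} - \frac{1}{m} \right) \\
&= u_{m-1} - 2u_{m-1} + \frac{1}{m} \sum_k \binom{m}{k+1} \frac{(-1)^k}{2^k} \binom{2k}{k};
\end{aligned}$$

hence

$$\begin{aligned}
m\,(u_m + u_{m-1}) &= \sum_k \frac{(-1)^k}{2^k} \binom{2k}{k} \binom{m}{k+1}, \\
(m-1)\,(u_{m-1} + u_{m-2}) &= \sum_k \frac{(-1)^k}{2^k} \binom{2k}{k} \binom{m-1}{k+1}.
\end{aligned}$$

Subtracting these equations yields

$$ m\,u_m = (m-1)\,u_{m-2}. $$

Now $u_0 = 1$ and $u_1 = 0$, hence $u_{2m+1} = 0$ as we knew; and

$$ u_{2m} = \frac{2m-1}{2m}\cdot\frac{2m-3}{2m-2}\cdots\frac{1}{2} = 2^{-2m} \binom{2m}{m}. \tag{7.7} $$

(Is there a simpler elementary proof of this formula?) We have shown that

$$ e^{-x}\,s_0(x) = \sum_{m\ge0} \frac{(-x)^{2m}}{(2m)!}\,u_{2m} = \sum_{m\ge0} \frac{(x/2)^{2m}}{m!\,m!} = I_0(x); \tag{7.8} $$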
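The evaluation of $u_m$ can be cross-checked by computing the defining sum (7.6) directly. The short sketch below is an illustration with an assumed function name `u`; it uses exact rationals and tests the two-step recurrence together with the closed form (7.7).

```python
from fractions import Fraction
from math import comb

def u(m):
    """u_m = sum over k of C(m,k) (-1/2)^k C(2k,k), the sum defined in (7.6)."""
    return sum(comb(m, k) * Fraction(-1, 2) ** k * comb(2 * k, k)
               for k in range(m + 1))

def check_recurrence(limit=20):
    """Verify m u_m = (m-1) u_{m-2} and the closed form for even and odd indices."""
    for m in range(2, limit):
        assert m * u(m) == (m - 1) * u(m - 2)
    for m in range(limit // 2):
        assert u(2 * m) == Fraction(comb(2 * m, m), 4 ** m)
        assert u(2 * m + 1) == 0
    return True
```

The first few values are $u_0 = 1$, $u_1 = 0$, $u_2 = 1/2$, $u_3 = 0$, $u_4 = 3/8$, in agreement with the product in (7.7).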
so our friend the modified Bessel function has appeared again. The
above relations now yield the identities

$$ s_r(x) = e^{2x}\,t_r(x) = e^{2x}\,\frac{(-1)^r}{1\cdot3\cdot\ldots\cdot(2r-1)}\,\frac{d^r}{dx^r}\bigl( e^{-x} I_0(x) \bigr), $$

so that

$$\begin{aligned}
s_0(x) &= e^x\,I_0(x), \\
s_1(x) &= e^x \bigl( I_0(x) - I_0'(x) \bigr), \\
s_2(x) &= \tfrac13\,e^x \bigl( I_0(x) - 2 I_0'(x) + I_0''(x) \bigr), \\
s_3(x) &= \tfrac1{15}\,e^x \bigl( I_0(x) - 3 I_0'(x) + 3 I_0''(x) - I_0'''(x) \bigr), \quad \text{etc.}
\end{aligned} \tag{7.9}$$

It is easy to see from definition (4.16) that

$$ I_0'(x) = I_1(x), \qquad I_1'(x) = I_0(x) - x^{-1} I_1(x); \tag{7.10} $$

hence we can express each $s_r(x)$ in terms of $I_0(x)$ and $I_1(x)$.


Finally to get $f_\infty$ and $b_\infty$ we need to express the sums in (6.9) in
terms of $s_r(x)$ for various r. The problem boils down to expressing
the binomial coefficient $\binom{2n+m}{n}$ as a linear combination of binomial
coefficients of the form $\binom{2n+2k}{n+k}$. For m = 0 this is no problem,
and for m = 1 we have

$$ \binom{2n+1}{n} = \frac12 \binom{2n+2}{n+1}, \qquad n \ge 0. $$

For $m \ge 2$ we can reduce the problem to the cases m-1 and m-2, since

$$ \binom{2n+m}{n} = \binom{2n+2+(m-1)}{n+1} - \binom{2n+2+(m-2)}{n+1}. $$

Iterating this idea leads us to the desired identity,

$$ \binom{2n+m}{n} = \sum_k (-1)^{m-k}\,\frac{m}{2k} \binom{k}{m-k} \binom{2n+2k}{n+k}, \qquad \text{for } m \ge 1. \tag{7.11} $$

In particular we get

$$\begin{aligned}
\binom{2n-2}{n} &= \frac12 \binom{2n}{n} - \binom{2n-2}{n-1} + \frac12\,\delta_{n,0}, \\
\binom{2n-2}{n-4} &= \frac12 \binom{2n+4}{n+2} - 3 \binom{2n+2}{n+1} + \frac92 \binom{2n}{n} - \binom{2n-2}{n-1} - \frac32\,\delta_{n,0}, \\
\binom{2n-1}{n} &= \frac12 \binom{2n}{n} + \frac12\,\delta_{n,0}, \\
\binom{2n-1}{n-4} &= \frac12 \binom{2n+6}{n+3} - \frac72 \binom{2n+4}{n+2} + 7 \binom{2n+2}{n+1} - \frac72 \binom{2n}{n} + \frac12\,\delta_{n,0},
\end{aligned}$$

for $n \ge 0$.
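Identities of this kind, with half-integer coefficients and correction terms at n = 0, are worth checking numerically. The sketch below is illustrative; the names `binom`, `central`, `expand` and `particular_cases_hold` are assumptions, and the general coefficient $(-1)^{m-k}\,\frac{m}{2k}\binom{k}{m-k}$ is the form of (7.11) assumed here.

```python
from fractions import Fraction

def binom(r, k):
    """Binomial coefficient for any integer r; zero when k < 0."""
    if k < 0:
        return 0
    result = 1
    for i in range(k):
        result = result * (r - i) // (i + 1)
    return result

def central(j):
    return binom(2 * j, j)

def expand(n, m):
    """Right-hand side of the general expansion of C(2n+m, n), for m >= 1."""
    return sum(Fraction((-1) ** (m - k) * m * binom(k, m - k), 2 * k) * central(n + k)
               for k in range(1, m + 1))

def particular_cases_hold(n):
    """The four special cases, including the correction terms at n = 0."""
    half, d = Fraction(1, 2), (1 if n == 0 else 0)
    return (
        binom(2*n - 2, n) == half * central(n) - binom(2*n - 2, n - 1) + half * d
        and binom(2*n - 2, n - 4) == (half * central(n + 2) - 3 * central(n + 1)
                                      + Fraction(9, 2) * central(n)
                                      - binom(2*n - 2, n - 1) - Fraction(3, 2) * d)
        and binom(2*n - 1, n) == half * central(n) + half * d
        and binom(2*n - 1, n - 4) == (half * central(n + 3) - Fraction(7, 2) * central(n + 2)
                                      + 7 * central(n + 1) - Fraction(7, 2) * central(n)
                                      + half * d)
    )
```

The correction terms at n = 0 come from the binomial coefficients with negative upper index, such as $\binom{-2}{0} = 1$, so that is the case most worth testing.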


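Letting $s_r$ stand for $s_r(1)$, we can now rewrite (6.9) as

$$\begin{aligned}
f_\infty &= \sum_{n\ge0} \frac{1}{2^n (n+2)!} \left( -\frac12 \binom{2n+4}{n+2} + 3 \binom{2n+2}{n+1} - 4 \binom{2n}{n} + 2\,\delta_{n,0} \right) \\
&= -1 - 2 s_0 + 6 s_1 - 4 s_2 \\
&= \tfrac43\,e\,I_0(1) - 2\,e\,I_1(1) - 1; \\
b_\infty &= \sum_{n\ge0} \frac{1}{2^n (n+3)!} \left( -\frac12 \binom{2n+6}{n+3} + \frac72 \binom{2n+4}{n+2} - 7 \binom{2n+2}{n+1} + 4 \binom{2n}{n} \right) \\
&= -3 - 4 s_0 + 14 s_1 - 14 s_2 + 4 s_3 \\
&= 2\,e\,I_0(1) - \tfrac{12}{5}\,e\,I_1(1) - 3.
\end{aligned} \tag{7.12}$$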
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The  Bessel  function  values  we  need  are  readily  computed  to  be 


(7.13)  Iq(1) 

1^(1) 


1,26606  58777  52008  33559  821+46  25214  71753  76077  - , 
0.56515  91039  92485  02720  76960  27609  86550  75289  - . 


Finally, therefore, we have the answers:

(7.14)
    $a_\infty = \frac{2}{3} - f_\infty$       = 0.15049 16196 41488 77320
    $b_\infty$                                = 0.19601 96040 80347 57536
    $c_\infty = f_\infty - e_\infty$          = 0.35250 55369 95186 10505
    $d_\infty = \frac{1}{3} - b_\infty$       = 0.13731 37292 52985 75797
    $e_\infty = 1 + b_\infty - 2 f_\infty$    = 0.16366 95100 29991 78842
    $f_\infty$                                = 0.51617 50470 25177 89347
    $g_\infty = 1 - f_\infty$                 = 0.48382 49529 74822 10653
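As a sanity check, the limiting probabilities can be recomputed in ordinary floating point. The sketch below assumes the closed forms $f_\infty = \frac{4}{3} e I_0(1) - 2 e I_1(1) - 1$ and $b_\infty = 2 e I_0(1) - \frac{12}{5} e I_1(1) - 3$, together with the relations $a_\infty = \frac{2}{3} - f_\infty$, $c_\infty = f_\infty - e_\infty$, $d_\infty = \frac{1}{3} - b_\infty$, $e_\infty = 1 + b_\infty - 2 f_\infty$, $g_\infty = 1 - f_\infty$; the function names are hypothetical and the precision is only that of double arithmetic.

```python
from math import e, factorial


def bessel_i(order, x, terms=30):
    """Modified Bessel function I_order(x), summed from its power series."""
    return sum(
        (x / 2) ** (2 * k + order) / (factorial(k) * factorial(k + order))
        for k in range(terms)
    )


def limiting_probabilities():
    """Limiting probabilities for the unmodified algorithm (assumed closed forms)."""
    i0, i1 = bessel_i(0, 1.0), bessel_i(1, 1.0)
    f = 4 / 3 * e * i0 - 2 * e * i1 - 1
    b = 2 * e * i0 - 12 / 5 * e * i1 - 3
    e_lim = 1 + b - 2 * f
    return {
        "a": 2 / 3 - f,
        "b": b,
        "c": f - e_lim,
        "d": 1 / 3 - b,
        "e": e_lim,
        "f": f,
        "g": 1 - f,
    }


for name, value in limiting_probabilities().items():
    print(f"{name}_inf = {value:.10f}")
```

The five three-node probabilities sum to 1 and the two two-node probabilities sum to 1, as they must.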


The average internal path length of the tree just before the $(n+1)$-st deletion is $3a_n + 3b_n + 2c_n + 3d_n + 3e_n = 3 - c_n$. We have proved that $c_n$ converges to $c_\infty$, which is greater than $c_0 = \frac{1}{3}$; this is consistent with the conjecture that deletions do not make the path length larger than pure insertions do. However, it is interesting to note that the convergence of $c_n$ to $c_\infty$ is not monotonic:

    n    c_n (exact)                c_n (decimal)
    0    1/3                        0.33333
    1    25/72                      0.34722
    2    19/54                      0.35185
    3    143/405                    0.35309
    4    3004/8505                  0.35320
    5    1152983/3265920            0.35303
    6    4667107/13226976           0.35285
    7    699791131/1984046400       0.35271 .



Therefore  random  deletions  do  not  always  enhance  the  average  path  length; 
the  pattern  IIIDIDIDIDI  leads  to  a better  average  search  time  than  does 
the  same  pattern  followed  by  DI  , and  an  argument  that  does  not  rely  on 
such  monotonicity  will  be  necessary  to  prove  Knott's  conjecture. 
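The non-monotonic behavior can be made explicit with exact arithmetic. The sketch below takes the eight values $c_0, \ldots, c_7$ as read from the table (several of the printed fractions are hard to make out, so the exact numerators and denominators used here should be treated as a reading of the table, not as independently derived values) and locates the maximum.

```python
from fractions import Fraction

# c_0 .. c_7 as read from the table (assumed transcription).
C = [
    Fraction(1, 3),
    Fraction(25, 72),
    Fraction(19, 54),
    Fraction(143, 405),
    Fraction(3004, 8505),
    Fraction(1152983, 3265920),
    Fraction(4667107, 13226976),
    Fraction(699791131, 1984046400),
]

for n, value in enumerate(C):
    print(n, value, f"{float(value):.5f}")

# Index at which c_n is largest, i.e. where the average path length 3 - c_n is smallest.
peak = max(range(len(C)), key=lambda n: C[n])
print("largest c_n at n =", peak)
```

With these values the sequence rises through $n = 4$ (the pattern IIIDIDIDIDI) and then decreases, which is the behavior described above.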
8.  Modified  Deletions.


To complete our study of this process we should also look at what
happens  if  the  "improved"  deletion  algorithm  discussed  on  p.  of  [^]  is 
used. Here a new "step D1½" is introduced, to simplify the deletion of
nodes  having  an  empty  left  subtree. 

The modified algorithm changes only one thing with respect to trees with three or fewer nodes: the deletion of $x$ from $D(x,y,z)$ now produces $F(y,z)$ instead of $G(y,z)$. The net effect is that the integral

$\int_0^x g_n(t,y)\,dt$

moves from the sum for $g_{n+1}(x,y)$ to the sum for $f_{n+1}(x,y)$ in (2.5).

Fortunately this change makes the analog of (5.2) much simpler than
before;  we  now  have 


(8.1)  $f_0(x,y) = 1$ ,

       $f_{n+1}(x,y) = \frac{1}{3}\Bigl( 2 + f_n(x,y) + \int_x^y f_n(x,t)\,dt \Bigr)$  for $n \ge 0$ ,

since (3.1) remains valid. The relation corresponding to (5.5) reduces to


(8.2)  $f_\infty(x,y) = 1 + \frac{1}{2} \int_x^y f_\infty(x,t)\,dt$ ,


and  by  arguing  as  before  (but  with  considerably  fewer  complications)  we 
can  deduce  the  solution 


(8.3)  $f_\infty(x,y) = e^{(y-x)/2}$ .
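The solution is easy to confirm numerically. The sketch below reads (8.2) as $f_\infty(x,y) = 1 + \frac{1}{2}\int_x^y f_\infty(x,t)\,dt$ (the integration limits $x$ and $y$ are an assumption, since they are not legible in this copy) and checks that $e^{(y-x)/2}$ satisfies it, using composite Simpson quadrature. The function names are hypothetical.

```python
from math import exp


def f_inf(x, y):
    """Candidate solution e^{(y - x)/2}."""
    return exp((y - x) / 2)


def rhs(x, y, steps=1000):
    """1 + (1/2) * integral of f_inf(x, t) for t from x to y, by Simpson's rule.

    `steps` must be even.
    """
    h = (y - x) / steps
    total = f_inf(x, x) + f_inf(x, y)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_inf(x, x + i * h)
    return 1 + 0.5 * total * h / 3


for x, y in [(0.0, 1.0), (0.2, 0.9), (0.5, 0.5)]:
    print(x, y, f_inf(x, y), rhs(x, y))
```

Both columns agree to quadrature accuracy, consistent with the exact computation $\frac{1}{2}\int_x^y e^{(t-x)/2}\,dt = e^{(y-x)/2} - 1$.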


In  fact,  it  is  not  difficult  to  establish  the  general  formula 


(8.4)


f (x,y)  = L 

0<k<n  k<t<n 
Since $f_\infty(x,y)$ now has such a simple form, we can easily determine
the limiting integrals corresponding to (2.5) and (2.6):


(8.5)
    $a_\infty = 8\sqrt{e} - 13$              = 0.1897701... ,
    $b_\infty = 20 - 12\sqrt{e}$             = 0.2153447... ,
    $c_\infty = 1/3$                         = 0.3333333... ,
    $d_\infty = 1/3 - b_\infty$              = 0.1179885... ,
    $e_\infty = 1/3 - a_\infty$              = 0.1435631... ,
    $f_\infty = 4\sqrt{e} - 6$               = 0.5948850... ,
    $g_\infty = 7 - 4\sqrt{e}$               = 0.4051149... .
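These closed forms are simple enough to evaluate directly. The sketch below recomputes them and, as a cross-check, integrates $e^{(y-x)/2}$ over $0 < x < y < 1$ by the midpoint rule; it assumes that $f_\infty$ is this double integral, reduced to $\int_0^1 (1-u)\, e^{u/2}\, du$ by the substitution $u = y - x$. The names are hypothetical.

```python
from math import e, exp, sqrt

ROOT_E = sqrt(e)

a_inf = 8 * ROOT_E - 13
b_inf = 20 - 12 * ROOT_E
closed = {
    "a": a_inf,
    "b": b_inf,
    "c": 1 / 3,
    "d": 1 / 3 - b_inf,
    "e": 1 / 3 - a_inf,
    "f": 4 * ROOT_E - 6,
    "g": 7 - 4 * ROOT_E,
}


def f_by_quadrature(steps=2000):
    """Midpoint rule for the integral of (1 - u) e^{u/2} over 0 <= u <= 1."""
    h = 1.0 / steps
    return h * sum((1 - (i + 0.5) * h) * exp((i + 0.5) * h / 2) for i in range(steps))


for name, value in closed.items():
    print(f"{name}_inf = {value:.7f}")
print("f_inf by quadrature =", f_by_quadrature())
```

The quadrature value agrees with $4\sqrt{e} - 6$, and the five three-node probabilities again sum to 1.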

As expected, there is now a stronger bias towards the F tree. The unexpected result is that $c_\infty$ has such a simple form compared to the others; in fact it turns out that

(8.6)  $c_n = 1/3$  for all $n \ge 0$ ,

so the average internal path length is the same as that of a random tree built up from three insertions! Eq. (8.6) follows easily from (8.1) and the fact that

$\iiint_{0<x<y<z<1} \bigl( (y-x)^k - (z-y)^k \bigr)\, dx\, dy\, dz = 0$  for $k \ge 0$ .
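The vanishing of this integral is a symmetry statement: the gaps $y - x$ and $z - y$ between consecutive order statistics have the same distribution. It can also be confirmed exactly. The sketch below expands each power by the binomial theorem and integrates monomials over the region $0 < x < y < z < 1$ in closed form; the helper names are hypothetical.

```python
from fractions import Fraction
from math import comb


def monomial_integral(a, b, c):
    """Exact integral of x^a y^b z^c over the region 0 < x < y < z < 1."""
    return Fraction(1, (a + 1) * (a + b + 2) * (a + b + c + 3))


def gap_moment_xy(k):
    """Integral of (y - x)^k over the region, via binomial expansion."""
    return sum(
        comb(k, j) * (-1) ** j * monomial_integral(j, k - j, 0) for j in range(k + 1)
    )


def gap_moment_yz(k):
    """Integral of (z - y)^k over the region, via binomial expansion."""
    return sum(
        comb(k, j) * (-1) ** j * monomial_integral(0, j, k - j) for j in range(k + 1)
    )


for k in range(6):
    print(k, gap_moment_xy(k), gap_moment_yz(k))
```

Both integrals come out to $1/((k+1)(k+2)(k+3))$, so their difference is zero for every $k \ge 0$.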

Since the values of $c_n$ in the unmodified algorithm are greater than
1/3, for $n \ge 1$, the average internal path length actually turns out to
be  worse  when  we  use  the  "improved"  algorithm.  On  the  other  hand,  Knott's 
empirical  data  in  [2]  indicate  that  the  modified  algorithm  does  indeed  lead 
to  an  improvement  when  the  trees  are  larger. 